NETWORK ROUTING IN A DYNAMIC ENVIRONMENT

By Nozer D. Singpurwalla 
George Washington University 

Recently, there has been an explosion of work on network routing in
hostile environments. Hostile environments tend to be dynamic, and the
motivation for this work stems from the scenario of IED placements by
insurgents in a logistical network. For discussion, we consider here a
sub-network abstracted from a real network, and propose a framework for
route selection. What distinguishes our work from related work is its
decision theoretic foundation, and statistical considerations pertaining
to probability assessments. The latter entails the fusion of data from
diverse sources, modeling the socio-psychological behavior of adversaries,
and likelihood functions that are induced by simulation. This paper
demonstrates the role of statistical inference and data analysis on
problems that have traditionally belonged in the domain of computer
science, communications, transportation science, and operations research.
1. Introduction: Background and overview. Network routing problems
involve the selection of a pathway from a source to a sink in a network.
Network routing is encountered in logistics, communications, the internet,
mission planning for unmanned aerial vehicles, telecommunications, and
transportation, wherein the cost effective and safe movement of goods,
personnel, or information is the driving consideration. In transportation
science and operations research, network routing goes under the label
vehicle routing problem (VRP); see Bertsimas and Simchi-Levi (1996) for a
survey. The flow of any commodity within a network is hampered by the
failure of one or more pathways that connect any two nodes. Pathway
failures could be due to natural and physical causes, or due to the
capricious actions of an adversary, for example, a cyber-attack on the
internet, or the placement of an improvised explosive device (IED) on a
pathway by an insurgent. Generally, the occurrence of all types of
failures is taken to be probabilistic. See, for example, Gilbert (1959),
or Savla, Temple and Frazzoli (2008) who assume that the placement of
mines in a region can be described by a spatio-temporal Poisson process.

Fig. 1. Subnetwork for transportation from A to I.

The traditional approach in network routing assumes that the failure 
probabilities are fixed for all time, and known; see, for example, Colburn 
(1987). Modern approaches recognize that networks operate in dynamic en- 
vironments which cause the failure probabilities to be dynamic. Dynamic 
probabilities are the manifestations of new information, updated knowledge, 
or new developments (circumstances); de Vries, Roefs and Theunissen (2007) 
articulate this matter for unmanned aerial vehicles. 

The work described here is motivated by the placement of IEDs on the
pathways of a logistical network; see Figure 1. Our aim is to prescribe an
optimal course of action that a decision maker D is to take vis-a-vis
choosing a route from the source to the sink. By optimal action we mean
selecting that route which is both cost effective and safe. D's efforts
are hampered by the actions of an adversary A, who, unknown to D, may
place IEDs in the pathways of the network. In military logistics, A is an
insurgent; in cyber security, A is a hacker. D's uncertainty about IED
presence on a particular route is encapsulated by D's personal
probability, and D's actions are determined by a judicious combination of
probabilities and D's utilities. For an interesting discussion on a
military planner's attitude to risk, see de Vries, Roefs and Theunissen
(2007) who claim that individuals tend to be risk prone when the
information presented is in terms of losses, and risk averse when it is in
terms of gains. Methods for a meaningful assessment of D's utilities are
not on the agenda of this paper; our focus is on an assessment of D's
probabilities, and the unconventional statistical issues that such
assessments spawn.

To cast this paper in the context of recent work in route selection under
dynamic probabilities, we cite Ye et al. (2010) who consider minefield
detection and clearing. For these authors, dynamic probabilities are a
consequence of improved estimation as detection sensors get close to their
targets. The focus of their work is otherwise different from the decision
theoretic focus of ours.

We suppose that D is a coherent Bayesian and thus an expected utility
maximizer; see Lindley (1985). This point of view has been questioned by
de Vries, Roefs and Theunissen (2007) who claim that humans use heuristics
to make decisions. The procedures we endeavor to prescribe are on behalf
of D. We do not simultaneously model A's actions, which is what would be
done by game theorists. Rather, our appreciation of A's actions is
encapsulated via likelihood functions, and modeling socio-psychological
behavior via subjectively specified likelihoods is a novel feature of this
paper. Fienberg and Thomas (2010) give a nice survey of the diverse
aspects of network routing dating from the 1950s, covering the spectrum of
probabilistic, statistical, operations research, and computer science
literatures. In Thomas and Fienberg (2010) an approach more comprehensive
than that of this paper is proposed; their approach casts the problem in
the framework of social network analysis, generalized linear models, and
expert testimonies.

1.1. Overview of the paper. We start Section 2 by presenting a
subnetwork, which is part of a real logistical network in Iraq, and some
IED data experienced by this subnetwork. For security reasons, we are
unable to present the entire network and do not have access to all its IED
experience. Section 3 pertains to the decision-theoretic aspects of
optimal route selection. We discuss both the nonsequential and the
sequential protocols. The latter raises probabilistic issues, pertaining
to the "Principle of Conditionalization," that appear to have been
overlooked by the network analysis communities. The material of Section 3
constitutes the general architecture upon which the material of Section 4
rests. Section 4 is about the inferential and statistical matters that the
architecture of Section 3 raises. It pertains to the dynamic assessment of
failure probabilities, and describes an approach for the integration of
data from multiple sources. Such data help encapsulate the actions of A,
and D's efforts to defeat them. The approach of Section 4 is Bayesian; it
entails the use of logistic regression and an unusual way of constructing
the necessary likelihood functions. Section 5 summarizes the paper, and
portrays the manner in which the various pieces of Sections 3 and 4 fit
together. Section 5 also closes the paper by showing the workings of our
approach on the network of Section 2.

2. A network for transportation logistics. Figure 1 is a subnetwork ab- 
stracted from a real logistics network used in Iraq. The subnetwork has nine 
nodes, labeled A (not to be confused with adversary A) to I, and ten links, 
labeled 1 to 10. The source node is A and the sink node is I. 
Table 1
Historical data on IED placements on 12 bridges in Iraq
There are thirteen bridges dispersed over the ten links of Figure 1, with
link 9 having one bridge, the "new bridge." This bridge is a mile away
from a park, the old city, the bus station, and the mosque. The precise
locations of the remaining 12 bridges in the subnetwork are classified.
There have been four crossings on the "new bridge," and none of these have
experienced an IED attack. To plan an optimal route from source to sink, D
needs to know the probability of experiencing an IED attack on the next
crossing on each of the ten links. However, we focus discussion on link 9,
because it is for this link that we have information on the number of
previous crossings.

To assess the required probabilities, we need to have all possible kinds
of information, including that given in Table 1, which gives the history
of IED placements on the remaining twelve bridges of the subnetwork. The
data of Table 1, though public, were painstakingly generated via
information from multiple sources, such as Google Maps, by the so-called
process of "connecting the dots." Generally, such data are hard to come by
via the public domain. The recently released WikiLeaks (2010) data has
some covariate information on IED experiences in Afghanistan. However,
there are very few well-defined logistical routes in Afghanistan, and
those that may be there are not identified in the WikiLeaks database.
Furthermore, the covariate information that is available is not of the
kind relevant to route selection. Thus, for this paper, the
WikiLeaks-Afghanistan data are of marginal value.

In Table 1, the column labeled "Attack" is 1 whenever the bridge has
experienced an attack; otherwise it is 0. The other columns give the
distance of the bridge, in miles, from population centers like a park, old
city, bus station, and mosque. An entry of zero denotes that the bridge is
next to the landmark. Whereas data on IED attacks tends to be public
(because of press reports), data on the number of crossings by convoys,
the number of IEDs cleared, the composition of the convoys, etc., remains
classified.

The three routes suggested by Figure 1 are as follows: (1,2,3,4,5,6,7,8),
(1,2,3,4,10), and (1,2,9). Since IEDs are placed by adversaries, D is
generally uncertain of their presence when planning begins. Additionally,
there are pros and cons with each route in terms of distance traversed,
route conditions (such as the number of curves and bends, terrain
topology), proximity to hostile territory, receptiveness of the local
population to harboring insurgents, and so on. In actuality, D will have
access to historical data of the type shown in Table 1, and also
information about the nature of the cargo, the convoy speed, intelligence
about the cunning and sophistication of the insurgents, the number of
previous unencumbered crossings on a link, etc.

D's problem is to select an optimal route among the three routes given
above. A variant is to specify the optimal route sequentially. That is,
start by going from A to C via links 1 and 2, and then, upon arrival at C,
make a decision to proceed along link 9 to the sink, or to take the
circuitous routes via the links 3 to 8, and 10, to get to the sink.
Similarly, upon arrival at node E, D could proceed along link 10, or via
the links 5, 6, 7, and 8 to arrive at the sink. D's decision as to which
choice to make will be based on D's uncertainty of IED presence on the
links 3 to 10, assessed when D is at node C and at node E.

Thus, optimal route selection is a problem of decision under uncertainty.
Because of the dynamic environment in which convoys operate, D's
uncertainties change over time. In Section 3 we prescribe a
decision-theoretic architecture for route selection. This requires that D
assess his (her) uncertainties about IED placements, as well as utilities
for a successful or failed traversal. Since D's uncertainties are dynamic,
the prescription of Section 3 is also dynamic; that is, the selected route
is optimal only for an upcoming trip. The main challenge therefore is an
assessment of the dynamic probabilities; see Section 4.

3. D's decision-theoretic architecture. Under the nonsequential protocol,
D needs to choose, at decision time, from the following: $D_1$ = take
route (1, 2, 9); $D_2$ = take route (1, 2, 3, 4, 5, 6, 7, 8); or $D_3$ =
take route (1, 2, 3, 4, 10). Figure 2 shows D's decision tree for these
choices, with each $D_i$ leading to a random node $R_i$, and each $R_i$
leading to an outcome S (for success) or F (for failure), i = 1, 2, 3.
Here S is the event that an IED is not encountered on any link of the
route, and F the event that an IED is encountered. If D is aware of any
route clearing activity, then this becomes a part of D's covariates used
to assess probabilities. The presence of an IED does not necessarily imply
an explosion. Unexploded IEDs cause disruptions, and D's aim is to choose
that route which minimizes the risk of damage and disruption.
Fig. 2. D's decision tree for nonsequential actions.

In Figure 2, $p_1(S)$ and $p_1(F) = 1 - p_1(S)$ denote D's probabilities
for success and failure, and $U(D_1, S)$ and $U(D_1, F)$, D's utilities
under $D_1$. The quantities $p_2(S)$, $p_2(F)$, $U(D_2, S)$, and
$U(D_2, F)$ pertain to $D_2$; similarly, for $D_3$.

Assessing utilities is a substantive task [cf. Singpurwalla (2010)]
entailing rewards, penalties, and attitudes to risk. This task is not
pursued here. However, one often assumes binary loss functions, so that
$U(D_i, S) = 1$ and $U(D_i, F) = 0$.

Per the principle of maximization of expected utility, D chooses that
$D_i$ for which the expected utility is a maximum. Thus, at each $R_i$, D
computes, for i = 1, 2, 3,

$$E[U(D_i)] = p_i(S)\,U(D_i, S) + p_i(F)\,U(D_i, F),$$

and chooses that $D_i$ which maximizes $E[U(D_i)]$.
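The expected-utility rule can be sketched in a few lines of Python. This is only an illustration: the success probabilities are made-up placeholders, the names `expected_utility` and `best_route` are hypothetical, and binary utilities are assumed unless others are supplied.

```python
def expected_utility(p_success, u_success=1.0, u_failure=0.0):
    """E[U(D_i)] = p_i(S) U(D_i, S) + p_i(F) U(D_i, F), with p_i(F) = 1 - p_i(S)."""
    return p_success * u_success + (1.0 - p_success) * u_failure


def best_route(p_success_by_route, utilities=None):
    """Return (route, expected utility) for the route maximizing expected utility.

    utilities optionally maps a route to (U(D_i, S), U(D_i, F));
    routes without an entry get the binary utilities (1, 0).
    """
    utilities = utilities or {}
    scored = {}
    for route, p_s in p_success_by_route.items():
        u_s, u_f = utilities.get(route, (1.0, 0.0))
        scored[route] = expected_utility(p_s, u_s, u_f)
    choice = max(scored, key=scored.get)
    return choice, scored[choice]


# Hypothetical values of p_i(S); not taken from the paper.
p_success = {"D1": 0.90, "D2": 0.80, "D3": 0.85}
route, value = best_route(p_success)
```

With binary utilities the rule reduces to picking the route with the largest success probability; non-binary utilities can reverse that ordering, which is why the utilities are kept as explicit inputs.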

3.1. D's assessment of $p_i(S)$. The building blocks of $p_i(S)$ are the
$p(j)$'s, D's probabilities of an IED placement on link j, j = 1, ..., 10.
Under action $D_1$, the event S will occur at the terminus of the tree if
there is no IED placement on the links 1, 2, and 9. If $E(j)$ denotes the
event that an IED is placed on link j, then $p(j)$ is an abbreviation for
$P(E(j))$. If D assumes that the $E(j)$'s, j = 1, 2, 9, are independent,
then

$$p_1(S) = (1 - p(1))(1 - p(2))(1 - p(9)) \quad\text{and}\quad p_1(F) = 1 - p_1(S);$$

otherwise,

$$p_1(F) = p(1) + p(2) + p(9) - p(1,2) - p(1,9) - p(2,9) + p(1,2,9),$$

where $p(j,k)$ is D's joint probability that both $E(j)$ and $E(k)$ occur,
$j \neq k$; similarly with $p(j,k,l)$, $j \neq k \neq l$. If $p(j|k)$
denotes D's conditional probability of $E(j)$ given $E(k)$, and if D
judges $E(j)$ independent of $E(l)$, given $E(k)$, then

$$p(j,k,l) = p(j|k)\,p(k|l)\,p(l).$$
Conditional independence in networks is often invoked when dependence
between $E(j)$ and $E(k)$ matters only when links j and k are neighbors.
Since links 1 and 9 are not neighbors, D may judge $E(1)$ and $E(9)$
independent given $E(2)$.

D's main task is to assess the probabilities of the type $p(j)$ and
$p(j|k)$. The material of Section 4 pertains to this exercise.
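As a hedged illustration of the route-level formulas of Section 3.1, the sketch below computes the success probability under independence, the failure probability by inclusion-exclusion when joint probabilities are supplied, and a three-way joint probability under the conditional independence factorization. The function names and all numerical values are hypothetical.

```python
from math import prod


def success_prob_independent(link_probs):
    """p_i(S) when the events E(j) on the route's links are judged independent."""
    return prod(1.0 - p for p in link_probs)


def failure_prob_inclusion_exclusion(p1, p2, p9, p12, p19, p29, p129):
    """p_1(F) for route (1, 2, 9) from marginal and joint probabilities."""
    return p1 + p2 + p9 - p12 - p19 - p29 + p129


def joint_three(p_j_given_k, p_k_given_l, p_l):
    """p(j, k, l) = p(j|k) p(k|l) p(l), valid when E(j) is independent of E(l) given E(k)."""
    return p_j_given_k * p_k_given_l * p_l


# Hypothetical link probabilities p(1), p(2), p(9).
p1, p2, p9 = 0.05, 0.10, 0.20
p_success = success_prob_independent([p1, p2, p9])

# Sanity check: with joints set to products, inclusion-exclusion agrees
# with the independence formula.
p_failure = failure_prob_inclusion_exclusion(
    p1, p2, p9, p1 * p2, p1 * p9, p2 * p9, p1 * p2 * p9
)
```

Under dependence, the joint terms would instead come from D's assessed conditional probabilities, for example `joint_three(p(9|2), p(2|1), p(1))` for route (1, 2, 9).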

3.2. Decision making under a sequential protocol. Here, D starts with
a single choice, namely, getting to node C via links 1 and 2, and then,
upon arriving at C, making one of two choices: get to the sink via link 9,
or via the links 3 through 8, and 10. With three choices, the decision
tree for the sequential protocol will be analogous to that of Figure 2,
save for the fact that the decision nodes will be at nodes C and E,
instead of being at node A. The rest of the analysis parallels that
described in the material following Figure 2 [cf. Singpurwalla (2009)],
save for one matter, namely, the caveat of conditionalization.

3.2.1. The caveat of conditionalization. The principle of
conditionalization (POC) pertains to probability assessments of two (or
more) events when the disposition of one of them becomes known [cf.
Singpurwalla (2006), page 21, and (2007)]. It arises because conditional
probabilities are in the subjunctive mood. When the disposition of the
conditioning event becomes known, and the POC is upheld, the probability
of the unconditioned event is its previously assessed conditional
probability. When the POC is not upheld, one assesses the probability of
the unconditioned event via a likelihood and Bayes' Law, using the
revealed value of the conditioning event as data. When sequential routing
is done for strategic reasons, socio-psychological issues come into play,
and then it is realistic to assess the probability of the unconditioned
event via a likelihood.

To illustrate the above, consider the scenario of D choosing a sequential
protocol and, having arrived at node C, needing to assess the quantities
$p_{2,9}(S)$ and $p_{2,3}(S)$, where $p_{2,9}(S)$ is the probability of
successfully arriving at the sink via links 2 and 9. If the POC is upheld,
then $p_{2,9}(S)$ is obtained as $P(E^c(9) \mid E^c(2))$, where $E^c(2)$
is the event of no IED presence on link 2. If the POC is not upheld, then

$$p_{2,9}(S) = P(E^c(9); E^c(2)) \propto \mathcal{L}(E^c(9); E^c(2))\,(1 - p(9)),$$

where the middle term is D's likelihood of an IED absence on link 9, under
the sure knowledge of an IED absence on link 2. Similarly with
$p_{2,3}(S)$.

The likelihood is specified by D and is the price to be paid for rejecting
the POC. Such likelihoods may encapsulate the socio-psychological
considerations that D chooses to exercise. Since the likelihood is a
weight that D assigns to a prior probability, D may upgrade (downgrade)
the prior via the likelihood depending on whether the absence of an IED on
link 2 would make the presence of an IED on link 9 more (or less) likely.
Here much depends on what D thinks of the abilities and resources of
insurgents.
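One way to turn the proportionality for the case where the POC is not upheld into a number is to normalize over the two possibilities for link 9 (IED absent or present). The sketch below does this. The normalization, the function name `success_prob_poc_rejected`, and the numerical weights are assumptions of this illustration; the paper leaves the likelihood weights to D's subjective judgment.

```python
def success_prob_poc_rejected(p9, lik_absent, lik_present):
    """Reweight the prior 1 - p(9) by D's likelihood and normalize.

    p9          : D's prior probability of an IED on link 9.
    lik_absent  : weight D gives to "no IED on link 9", having learned
                  that link 2 carried no IED.
    lik_present : weight D gives to "IED on link 9" under the same knowledge.
    """
    numerator = lik_absent * (1.0 - p9)
    denominator = numerator + lik_present * p9
    return numerator / denominator


# Equal weights leave the prior unchanged.
unchanged = success_prob_poc_rejected(0.2, 1.0, 1.0)

# A wary D, who reads a clear link 2 as a sign that resources were saved
# for link 9, downgrades the prior probability of success.
downgraded = success_prob_poc_rejected(0.2, 0.5, 1.0)
```

Only the ratio of the two weights matters, which matches the view of a likelihood as a relative weight rather than a probability.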

4. Dynamic assessment of link probabilities. By link probabilities, we
mean unconditional probabilities of the type $p(j)$, j = 1, ..., 10. By a
dynamic assessment, we mean an updating of each $p(j)$ due to additional
information that can come in the form of hard data, expert testimonies,
socio-psychological considerations, or new covariate information. The
updating of a $p(j)$ can come into play at any time, most often at the
commencement of each route scheduling session, or in the case of
sequential routing, at any time during the cycle at an intermediate node.
In what follows, we focus on link j, and discuss the assessment of $p(j)$.
A dynamic assessment of the conditional probabilities $p(j|k)$ is
discussed in Section 4.4.

Factors that influence any $p(j)$ would be covariates such as route
topography (the number of bends, curves, bridges, and surface conditions),
convoy size and composition (materials or humans), convoy speed, time of
transport (day or night), weather conditions, political climate, etc. A
second factor would be historical data on IED placements on link j, and on
all the other links in the region. Finally, also relevant would be D's
subjective view about $p(j)$, encapsulated via a prior.

4.1. Notation and terminology. Let $(X = 1)$ denote the event that one or
more IEDs are placed on link j; $(X = 1)$ is a proxy for $E(j)$, and
$P(X = 1)$ a proxy for $p(j)$. To avoid cumbersome notation, we will not
endow X with the index j. Let $Z_1, \ldots, Z_k$ be k covariates that
influence $p(j)$, and denote these by the vector
$\mathbf{Z} = (Z_1, \ldots, Z_k)$; $\mathbf{Z}$ is assumed known to D.
Suppose that there have been n crossings on link j, with $X_m = 1\,(0)$ if
the mth crossing experienced (did not experience) an IED, m = 1, ..., n.
Let $\mathbf{X} = (X_1, \ldots, X_n)$ denote the historical IED experience
on link j. Assume that $X_1 = X_2 = \cdots = X_n = 0$, or that
$X_1 = X_2 = \cdots = X_{n-1} = 0$ and $X_n = 1$. That is, D has observed
a series of n successes on link j, or has just experienced a failure.
Motivation for these extreme cases is given later.

The IED experience for the entire region is in the matrix $\mathbf{D}$,
where

$$\mathbf{D} = \begin{pmatrix}
Y_1 & Z_{11} & \cdots & Z_{k1} \\
Y_2 & Z_{12} & \cdots & Z_{k2} \\
\vdots & \vdots & & \vdots \\
Y_s & Z_{1s} & \cdots & Z_{ks}
\end{pmatrix}.$$

In the lth row of $\mathbf{D}$, $Y_l = 1\,(0)$ if an IED presence has been
encountered (not encountered) under condition $Z_{1l}, \ldots, Z_{kl}$,
for l = 1, ..., s. Thus, at D's disposal are the s IED related experiences
in the region, and associated with each experience are the values of the k
covariates that influence each experience. To avoid any duplicate
weighting of data, $\mathbf{X}$ will not be a part of $\mathbf{D}$. The
motivation for excluding $\mathbf{X}$ from $\mathbf{D}$ is to give link j
a special emphasis by incorporating the effect of $\mathbf{X}$, which is
specific to link j, in a vein that is different from $\mathbf{D}$.

Let $x_i$ be the realization of $X_i$, and $y_l$ of $Y_l$, i = 1, ..., n
and l = 1, ..., s. Each $x_i$ = 1 or 0; similarly, $y_l$. The matrix
$\mathbf{D}$ is assumed known to D; its elements may not be controlled by
D.

D's task is to assess $P(X = 1; \mathbf{x}, \mathbf{Z}, \mathbf{D}^*)$,
where $\mathbf{x} = (x_1, \ldots, x_n)$, and $\mathbf{D}^*$ is
$\mathbf{D}$ with the $Y_l$'s replaced by $y_l$, l = 1, ..., s. The above
expression is D's probability of an IED presence on link j, knowing
$\mathbf{x}$, $\mathbf{Z}$, and $\mathbf{D}^*$. Assessing this probability
is tantamount to fusing data from two sources: IED experience on link j,
and historical IED experience in the region wherein j resides. It is a
form of weighting wherein one borrows strength based on individual and
population characteristics.

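4.1.1. The proposed model. Start by assuming $\mathbf{x}$ unknown, so that
$P(X = 1; \mathbf{x}, \mathbf{Z}, \mathbf{D}^*)$ is
$P(X = 1 \mid \mathbf{X}; \mathbf{Z}, \mathbf{D}^*)$, and invoke the law
of total probability to write

$$P(X = 1 \mid \mathbf{X}; \mathbf{Z}, \mathbf{D}^*) = \int_0^1 P(X = 1 \mid p, \mathbf{X}; \mathbf{Z}, \mathbf{D}^*)\,\pi(p \mid \mathbf{X}; \mathbf{Z}, \mathbf{D}^*)\,dp,$$

where p is a propensity [see Singpurwalla (2006), page 50], and
$\pi(p \mid \mathbf{X}; \mathbf{Z}, \mathbf{D}^*)$ is D's uncertainty
about p, given $\mathbf{X}$, with $\mathbf{Z}$ and $\mathbf{D}^*$ known.
The propensity of event $\mathcal{E}$ is the proportion of times
$\mathcal{E}$ occurs in an infinite number of trials. If we assume that,
given p, the event $(X = 1)$ is independent of $\mathbf{X}$, $\mathbf{Z}$,
and $\mathbf{D}^*$, then

$$P(X = 1 \mid \mathbf{X}; \mathbf{Z}, \mathbf{D}^*) = \int_0^1 p \cdot \pi(p \mid \mathbf{X}; \mathbf{Z}, \mathbf{D}^*)\,dp, \tag{4.1}$$

and by Bayes' Law,

$$\pi(p \mid \mathbf{X}; \mathbf{Z}, \mathbf{D}^*) \propto \pi(\mathbf{X} \mid p; \mathbf{Z}, \mathbf{D}^*) \cdot \pi(p; \mathbf{Z}, \mathbf{D}^*) = \pi(\mathbf{X} \mid p) \cdot \pi(p; \mathbf{Z}, \mathbf{D}^*),$$

if given p, $\mathbf{X}$ is independent of $\mathbf{Z}$ and
$\mathbf{D}^*$. Here $\pi(p; \mathbf{Z}, \mathbf{D}^*)$ is D's uncertainty
about p in light of $\mathbf{Z}$ and $\mathbf{D}^*$, and
$\pi(\mathbf{X} \mid p)$ is D's probability model for $\mathbf{X}$.
Equation (4.1) now becomes

$$P(X = 1 \mid \mathbf{X}; \mathbf{Z}, \mathbf{D}^*) \propto \int_0^1 p \cdot \pi(\mathbf{X} \mid p) \cdot \pi(p; \mathbf{Z}, \mathbf{D}^*)\,dp. \tag{4.2}$$

However, $\mathbf{X}$ is observed as $\mathbf{x}$, and, thus, a
probability model for $\mathbf{X}$ does not make sense. We therefore write
$P(X = 1 \mid \mathbf{X}; \mathbf{Z}, \mathbf{D}^*)$ as
$P(X = 1; \mathbf{x}, \mathbf{Z}, \mathbf{D}^*)$, and
$\pi(\mathbf{X} \mid p)$ as $\mathcal{L}(p; \mathbf{x})$, the likelihood
of p under $\mathbf{x}$. Now equation (4.2) becomes

$$P(X = 1; \mathbf{x}, \mathbf{Z}, \mathbf{D}^*) \propto \int_0^1 p \cdot \mathcal{L}(p; \mathbf{x}) \cdot \pi(p; \mathbf{Z}, \mathbf{D}^*)\,dp. \tag{4.3}$$

Equation (4.3) is our proposed model for assessing $p(j)$. To proceed, D
needs to specify the likelihood $\mathcal{L}(p; \mathbf{x})$ and
$\pi(p; \mathbf{Z}, \mathbf{D}^*)$, the posterior of p.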

4.2. Psychological considerations in specifying likelihoods. The IED
scenario entails special considerations for specifying
$\mathcal{L}(p; \mathbf{x})$. These arise because D needs to incorporate
an insurgent's socio-psychological behavior in the IED placement process,
and also D's strategy for outfoxing the insurgent.

Recall that with $\mathbf{x} = (0, \ldots, 0)$ or
$\mathbf{x} = (0, \ldots, 0, 1)$, the conventional likelihood of p would
be $\mathcal{L}(p; \mathbf{x}) = p^{\sum x_i}(1 - p)^{n - \sum x_i}$,
which for the aforementioned $\mathbf{x}$ would be $(1 - p)^n$ or
$(1 - p)^{n-1} \cdot p$. The motivation for the conventional specification
is that a preponderance of failures (i.e., non-IED placements) should
decrease the propensity of an IED placement, and vice versa. However, the
conventional approach, though appropriate for scenarios which are
nonadversarial, is inappropriate for IED placement which embodies an
adversary with a socio-psychological agenda. It seems that here a
preponderance of failures should eventually increase the propensity of
success. Insurgents are opportunistic adversaries who may allow a series
of successful link crossings only to impart to D a sense of false
security, while all the time preparing to do damage on the next crossing.
Similarly, an astute D would view the occurrence of a success that is
preceded by a sequence of failures (i.e., non-IED placements) with much
pessimism, as a dramatic change in the operating environment. Essentially,
D would downgrade the impact of the observed sequence of (n - 1) failures
and strongly weigh the impact of the last success. With the above
behavioristic considerations, our proposed likelihood for p, for
$\mathbf{x} = (x_1, \ldots, x_n)$ fixed, is of the form

$$\mathcal{L}(p; \mathbf{x}) = (1 - p)^{\left(n - \sum x_i\right)^{1/\left(n - \sum x_i\right)}} \cdot p^{\sum x_i}.$$

When $\mathbf{x} = (0, \ldots, 0)$, the above likelihood becomes

$$\mathcal{L}(p; \mathbf{x}) = (1 - p)^{n^{1/n}}, \tag{4.4}$$

and when $\mathbf{x} = (0, \ldots, 0, 1)$, it is

$$\mathcal{L}(p; \mathbf{x}) = (1 - p)^{(n-1)^{1/(n-1)}} \cdot p. \tag{4.5}$$

As $n \to \infty$, equation (4.4) tends to $(1 - p)$, the conventional
likelihood for a single Bernoulli trial that results in a failure. With
$n \to \infty$, equation (4.5) tends to $(1 - p) \cdot p$, the
conventional likelihood for the case of two Bernoulli trials resulting in
one failure and one success. In an adversarial context, this is tantamount
to D regarding a long series of failures as only a single failure (i.e., D
does not become complacent), and a long series of failures followed by a
success as only one failure and one success. In this latter case, D gives
equal weight to the (n - 1) failures and the one success; that is, D
becomes deeply concerned when the first success is observed. Figure 3
illustrates the likelihood.

Fig. 3. The likelihood of p as a function of n (horizontal axis: values
of p).

The proposed likelihood of p is in the envelope bounded by $(1 - p)$ and
$(1 - p)^{3^{1/3}}$. Thus, after three successive failures D gives more
and more weight to larger values of p, suggesting an absence of D's
complacence with a long series of failures. The specification of the
likelihoods as embodied in equations (4.4) and (4.5) is a novel feature of
this paper; it is a possible approach to adversarial modeling.
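The behavior of the proposed likelihood can be checked numerically. The sketch below assumes that the exponent on (1 - p) is the m-th root of m, where m = n - (number of IED crossings) is the number of crossings without an IED; this form is an assumption that agrees with the limits stated in the text (the exponent tends to 1 as n grows) and with the remark about three successive failures (the exponent peaks at m = 3). Function names are hypothetical.

```python
def root_exponent(m):
    """m-th root of m: equals 1 at m = 1, is largest at m = 3, and tends to 1."""
    if m < 1:
        raise ValueError("need at least one crossing without an IED")
    return m ** (1.0 / m)


def conventional_likelihood(p, x):
    """Bernoulli likelihood p^(sum x) (1 - p)^(n - sum x)."""
    s = sum(x)
    return p ** s * (1.0 - p) ** (len(x) - s)


def adversarial_likelihood(p, x):
    """Assumed form: (1 - p)^(m^(1/m)) * p^(sum x), with m = n - sum x."""
    s = sum(x)
    m = len(x) - s
    return (1.0 - p) ** root_exponent(m) * p ** s


# Exponent on (1 - p) for m = 1, ..., 10 clean crossings.
exponents = [root_exponent(m) for m in range(1, 11)]

# After ten clean crossings the conventional likelihood at p = 0.3 is far
# smaller than the adversarial one, which discounts the clean run.
x_clean = [0] * 10
conv = conventional_likelihood(0.3, x_clean)
adv = adversarial_likelihood(0.3, x_clean)
```

Because the exponent never exceeds the cube root of 3 (about 1.44), a long clean run can never push the likelihood much below that of a single clean crossing.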

4.3. D's assessment of the posterior $\pi(p; \mathbf{Z}, \mathbf{D}^*)$.
An assessment of the posterior of p in the light of known covariates
$\mathbf{Z}$ and the historical data $\mathbf{D}^*$ is developed in two
stages. The challenge here is with the specification of the likelihood.

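Stage I: Logistic regression for extracting the information in
$\mathbf{D}^*$. Information provided by $\mathbf{D}^*$ lies in an
assessment of the posterior of
$\boldsymbol{\beta} = (\beta_1, \ldots, \beta_u, \ldots, \beta_k)$, where
$\boldsymbol{\beta}$ appears in a logistic regression model

$$P(Y_l = 1; \boldsymbol{\beta}, \mathbf{Z}_l) = \frac{1}{1 + \exp\left(-\sum_{u=1}^{k} Z_{ul}\,\beta_u\right)}$$

for l = 1, ..., s, with $\mathbf{Z}_l = (Z_{1l}, \ldots, Z_{kl})$.
Recall, $Y_l$ and $\mathbf{Z}_l$ are the lth row of $\mathbf{D}^*$.

Using standard but computationally intensive simulation procedures, we
can obtain the posterior of $\boldsymbol{\beta}$ in light of
$\mathbf{D}^*$. Denote this posterior as
$\pi(\boldsymbol{\beta}; \mathbf{D}^*)$.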
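One such simulation procedure is a random-walk Metropolis sampler, sketched below with the standard library only. This is an illustration under stated assumptions, not the procedure used in the paper: the Normal prior on the coefficients, the step size, the absence of an intercept (following the displayed model), and the data rows are all hypothetical.

```python
import math
import random


def logistic(t):
    """Numerically stable 1 / (1 + exp(-t))."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def log_posterior(beta, rows, prior_sd=10.0):
    """Unnormalized log posterior of beta given rows of (y, z).

    The independent Normal(0, prior_sd^2) prior is an assumption of this sketch.
    """
    lp = -sum(b * b for b in beta) / (2.0 * prior_sd ** 2)
    for y, z in rows:
        t = sum(z_u * b_u for z_u, b_u in zip(z, beta))
        # log P(Y = y) = y * t - log(1 + exp(t)), written to avoid overflow
        lp += y * t - (max(t, 0.0) + math.log1p(math.exp(-abs(t))))
    return lp


def sample_posterior(rows, k, n_iter=2000, burn_in=500, step=0.3, seed=0):
    """Random-walk Metropolis; returns (beta draw, unnormalized log posterior) pairs."""
    rng = random.Random(seed)
    beta = [0.0] * k
    lp = log_posterior(beta, rows)
    draws = []
    for it in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        lp_prop = log_posterior(proposal, rows)
        # Accept with probability min(1, posterior ratio).
        if lp_prop >= lp or rng.random() < math.exp(lp_prop - lp):
            beta, lp = proposal, lp_prop
        if it >= burn_in:
            draws.append((list(beta), lp))
    return draws


# Hypothetical rows (attack indicator, [distance covariates in miles]).
rows = [
    (1, [0.0, 1.0]), (1, [0.5, 0.0]), (1, [1.0, 0.5]), (1, [0.5, 1.5]),
    (0, [3.0, 4.0]), (0, [4.0, 3.5]), (0, [5.0, 2.0]), (0, [3.5, 5.0]),
]
draws = sample_posterior(rows, k=2)
```

Each stored draw carries its unnormalized log posterior value, which is the quantity Stage II pairs with the induced value of p.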

Stage II: The likelihood of p under $\mathbf{Z}$ and $\mathbf{D}^*$. To
assess the posterior $\pi(p; \mathbf{Z}, \mathbf{D}^*)$, invoke Bayes' Law
to write

$$\pi(p; \mathbf{Z}, \mathbf{D}^*) \propto \mathcal{L}(p; \mathbf{Z}, \mathbf{D}^*)\,\pi(p), \tag{4.6}$$

where $\mathcal{L}(p; \mathbf{Z}, \mathbf{D}^*)$ is the likelihood of p in
light of the known $\mathbf{Z}$ and $\mathbf{D}^*$, and $\pi(p)$ is D's
prior for p. Note that p and $\mathbf{Z}$ are specific to link j, whereas
$\mathbf{D}^*$ is common to all the links of the network. The prior on p
could be any suitable distribution, such as a beta distribution over
(0, 1). The main theme of Stage II, however, is a development of the
likelihood $\mathcal{L}(p; \mathbf{Z}, \mathbf{D}^*)$.

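Whereas likelihoods may be subjectively specified, the conventional method
is to invert a probability model by juxtaposing the parameter(s) and the
random variables. This is the strategy we use, but to do so we need a
probability model for p with $\mathbf{Z}$ and $\mathbf{D}^*$ as background
information. Since p depends on $\mathbf{Z}$, we denote this dependence by
replacing p with $p(\mathbf{Z})$. Thus, we seek a probability model for
$p(\mathbf{Z})$ with $\mathbf{D}^*$ as a background, namely,
$P[p(\mathbf{Z}); \mathbf{D}^*]$. But knowing $\mathbf{D}^*$ is equivalent
to knowing $\boldsymbol{\beta}$ with its posterior probability,
$\pi(\boldsymbol{\beta}; \mathbf{D}^*)$, developed in Stage I. Thus, for
$\boldsymbol{\beta} = \boldsymbol{\beta}^*$,
$[p(\mathbf{Z}); \boldsymbol{\beta}^*]$ has probability
$\pi(\boldsymbol{\beta}^*; \mathbf{D}^*)$. However, per the logistic
regression model,

$$[p(\mathbf{Z}); \boldsymbol{\beta}^*] = \frac{1}{1 + \exp\left(-\sum_{u} Z_u\,\beta_u^*\right)},$$

where $\beta_u^*$ appears as the uth element of
$\boldsymbol{\beta}^* = (\beta_1^*, \ldots, \beta_k^*)$.

To summarize, the event
$[p(\mathbf{Z}); \boldsymbol{\beta}^*] = 1/[1 + \exp(-\sum_u Z_u \beta_u^*)]$
has probability $\pi(\boldsymbol{\beta}^*; \mathbf{D}^*)$, and this
provides us with a probability model for $[p(\mathbf{Z}); \mathbf{D}^*]$.
Consequently, a plot of $(p(\mathbf{Z}); \boldsymbol{\beta}^*)$ versus
$\pi(\boldsymbol{\beta}^*; \mathbf{D}^*)$ provides the required likelihood
function.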

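To implement this idea, we sample a $\boldsymbol{\beta}^*$ from
$\pi(\boldsymbol{\beta}; \mathbf{D}^*)$ to obtain

$$[p(\mathbf{Z}); \boldsymbol{\beta}^*] = \frac{1}{1 + \exp\left(-\sum_{u} Z_u\,\beta_u^*\right)}$$

and also $\pi(\boldsymbol{\beta}^*; \mathbf{D}^*)$. A plot of
$(p(\mathbf{Z}); \boldsymbol{\beta}^*)$ versus
$\pi(\boldsymbol{\beta}^*; \mathbf{D}^*)$ is then the likelihood function
of p in light of $\mathbf{Z}$ and $\mathbf{D}^*$; see Figure 4.

Fig. 4. The likelihood of p with Z and D* known (vertical axis: the
likelihood $\mathcal{L}(p; \mathbf{Z}, \mathbf{D}^*)$; horizontal axis:
values of p, i.e., $(p(\mathbf{Z}); \boldsymbol{\beta}^*)$).

With $\pi(p(\mathbf{Z}))$ the prior on p specified, and the likelihood
$\mathcal{L}(p; \mathbf{Z}, \mathbf{D}^*)$ induced via a logistic
regression model governing $p(\mathbf{Z})$ and $\mathbf{D}^*$, the desired
posterior

$$\pi(p; \mathbf{Z}, \mathbf{D}^*) \propto \mathcal{L}(p; \mathbf{Z}, \mathbf{D}^*) \cdot \pi(p(\mathbf{Z}))$$

can be numerically assessed.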

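Once the above is done, all the necessary ingredients for obtaining
equation (4.3), which can now be written as

$$P(X = 1; \mathbf{x}, \mathbf{Z}, \mathbf{D}^*) \propto \int_0^1 p \cdot \mathcal{L}(p; \mathbf{x}) \cdot \mathcal{L}(p; \mathbf{Z}, \mathbf{D}^*) \cdot \pi(p)\,dp, \tag{4.7}$$

are at hand. The above expression can be numerically evaluated.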
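A numerical evaluation of equation (4.7) might look like the following sketch. Several choices here are assumptions rather than the paper's prescription: the likelihood in light of Z and D* is approximated by a histogram of values of p(Z) induced by posterior draws of the regression coefficients, the proportionality is resolved by dividing by the same integral without the leading p, the prior is a Beta(a, b) density, and the link-history likelihood uses the m-th-root exponent form. All names are hypothetical.

```python
import math


def beta_pdf(p, a, b):
    """Beta(a, b) density at p in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(p) + (b - 1.0) * math.log(1.0 - p))


def adversarial_likelihood(p, x):
    """Assumed form (1 - p)^(m^(1/m)) * p^(sum x), with m = n - sum x >= 1."""
    s = sum(x)
    m = len(x) - s
    if m < 1:
        raise ValueError("need at least one crossing without an IED")
    return (1.0 - p) ** (m ** (1.0 / m)) * p ** s


def histogram_likelihood(p_samples, n_bins):
    """Histogram heights of the sampled p(Z) values over equal bins on (0, 1)."""
    counts = [0] * n_bins
    for ps in p_samples:
        counts[min(int(ps * n_bins), n_bins - 1)] += 1
    return [c * n_bins / len(p_samples) for c in counts]


def predictive_prob(x, p_samples, a=1.0, b=1.0, n_bins=50):
    """Midpoint-rule estimate of P(X = 1; x, Z, D*) from equation (4.7), normalized."""
    heights = histogram_likelihood(p_samples, n_bins)
    numerator = 0.0
    denominator = 0.0
    for i, height in enumerate(heights):
        p = (i + 0.5) / n_bins  # bin midpoint, strictly inside (0, 1)
        weight = adversarial_likelihood(p, x) * height * beta_pdf(p, a, b)
        numerator += p * weight
        denominator += weight
    return numerator / denominator


# Hypothetical induced values of p(Z), spread evenly over (0, 1), and a
# link history of one crossing without an IED.
p_samples = [(i + 0.5) / 200 for i in range(200)]
estimate = predictive_prob([0], p_samples)
```

In practice `p_samples` would be obtained by pushing each posterior draw of the coefficients through the logistic function at the covariates Z of the upcoming trip.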

4.4. Dynamic assessment of conditional probabilities. For both the
nonsequential and sequential protocols wherein the POC is upheld, we need
to assess conditional probabilities of the type $p(n|m)$, where links m
and n are adjacent to each other, and traversing on m precedes that on n.
There are two possible strategies. The first one is for D to subjectively
change the assessed $p(n)$ by either increasing it because an insurgent
might find it easy to populate neighboring links with IEDs, or to decrease
it if D thinks that an insurgent has limited resources for placing IEDs.

The second approach is less subjective because it incorporates data on
IED placements or nonplacements on neighboring links. The idea here is to
treat the conditioning event $E(m)$ as a covariate, so that the vectors
$\mathbf{Z}$ and $\boldsymbol{\beta}$ of Sections 4.1 and 4.3 get expanded
by an additional term, as $\mathbf{Z} = (Z_1, \ldots, Z_k, 1)$ and
$\boldsymbol{\beta} = (\beta_1, \ldots, \beta_k, \beta_{k+1})$.
Correspondingly, the matrix $\mathbf{D}$ of Section 4.1 also gets expanded
to include an additional column whose lth term $Z_{(k+1)l}$ is 1 whenever
there has been an IED experience in a preceding link; otherwise
$Z_{(k+1)l}$ is 0. With the above in place, a repeat of the exercise
described in Section 4.3 would enable a formal assessment of the
conditional probabilities. The only other matter that remains to be
addressed pertains to the likelihood of p as discussed in Section 4.2.
Since the likelihood is a weight assigned to the posterior of p, D may
either increase the $\mathcal{L}(p; \mathbf{x})$ of equations (4.4) and
(4.5), or decrease it depending on what D thinks of an insurgent's
abilities and resources. D would increase $\mathcal{L}(p; \mathbf{x})$ if
D feels that the insurgent's resources are plentiful; otherwise D
downgrades $\mathcal{L}(p; \mathbf{x})$.
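The covariate expansion of the second approach amounts to appending one indicator column to the regional data and a trailing 1 to the covariates of the link being assessed. A minimal sketch follows, with hypothetical function names, data, and coefficient values.

```python
import math


def augment_rows(rows, preceding_attack):
    """Append the indicator Z_(k+1) to each row of (y, z).

    preceding_attack holds one flag per row: truthy when there was an
    IED experience on the preceding link.
    """
    if len(rows) != len(preceding_attack):
        raise ValueError("need one flag per row")
    return [(y, list(z) + [1 if flag else 0])
            for (y, z), flag in zip(rows, preceding_attack)]


def conditional_link_prob(z, beta):
    """Logistic probability with Z expanded to (Z_1, ..., Z_k, 1).

    beta must have k + 1 elements; the last one multiplies the indicator
    that an IED was placed on the preceding link m.
    """
    z_aug = list(z) + [1.0]
    if len(z_aug) != len(beta):
        raise ValueError("beta must have one more element than z")
    t = sum(z_u * b_u for z_u, b_u in zip(z_aug, beta))
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


# Hypothetical regional rows and flags.
rows = [(1, [0.0, 1.0]), (0, [3.0, 4.0]), (1, [0.5, 0.0])]
expanded = augment_rows(rows, [1, 0, 0])

# With a zero coefficient on the indicator the preceding link has no
# effect; a positive coefficient raises the conditional probability.
base = conditional_link_prob([1.0, 2.0], [0.1, -0.2, 0.0])
raised = conditional_link_prob([1.0, 2.0], [0.1, -0.2, 1.0])
```

The expanded rows would then be passed through the Stage I and Stage II steps of Section 4.3 unchanged, with k + 1 in place of k.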

5. Summary and conclusions. Equation (4.7) shows how D can assess $p(j)$,
the probability of one or more IED placements on link j, in a unified
manner by a systematic application of the Bayesian approach. It entails
a fusion of information on past IED experience on link j (encapsulated by
$\mathbf{x}$), historical data on IED experience in the region
(encapsulated by the matrix $\mathbf{D}^*$), and D's subjective views
about $p(j)$, encapsulated via the likelihood
$\mathcal{L}(p; \mathbf{x})$ and the prior $\pi(p)$. The essence of
equation (4.7) is that its right-hand side is the expected value of a
weighted prior distribution of p. The weighting of the prior is by the
product of two likelihoods, one reflecting historical IED experience
specific to link j, and the other reflecting historical IED experience in
the region as well as the relevant covariates specific to the forthcoming
trip contemplated by D. The entire development, being grounded in the
calculus of probability, is therefore coherent.

Though cumbersome to plough through, there are novel features to the two
likelihoods. The first likelihood, that of equations (4.4) and (4.5), is
an unconventional likelihood for use with Bernoulli trials. It is
motivated by socio-psychological considerations attributed to both the
insurgents who place the IEDs, as well as to D, who does not become
complacent upon a sequence of successful crossings and who upon the
occurrence of the first failure adopts the posture of extreme caution. The
second likelihood, that of Figure 4, is induced in an unusual manner by
leaning on the posterior distribution of the parameter vector of a
logistic regression.

The approach of Section 4 displays the manner in which information from 
different sources can be fused by decomposing the likelihood of p. Equation 
(4.7) shows this. The material of Section 4 feeds into that of Section 3 which 
pertains to sequential and nonsequential decision making under uncertainty. 

The computational and simulation work spawned by Section 4 entails
logistic regression, generating $k$-dimensional samples from the posterior
distribution of $\boldsymbol{\beta}$, numerically assessing $\pi(p;\mathbf{Z},\mathbf{D}^*)$ of equation (4.6), and
numerical integration to obtain $P(X = 1;\mathbf{x},\mathbf{Z},\mathbf{D}^*)$ of equation (4.7). None of
these poses any obstacles. Section 4.4 pertains to conditional probabilities. It
expands on Sections 4.1 through 4.3 by treating the conditioning events as
covariates.

5.1. Data and information requirements. The one major obstacle pertains
to the paucity of data for validating the approach. The required
data, namely $\mathbf{x}$, $\mathbf{Z}$, and $\mathbf{D}^*$, are available to military logisticians, but
are almost always classified. The WikiLeaks data tend to focus on IED
explosions and not on success stories wherein IEDs get cleared; the same
holds for other publicly available data. Information that is relevant to
constructing the likelihood based on socio-psychological considerations is
highly individualized, and perhaps not even recorded. It is desirable to collect
this kind of information via experiments pertaining to the psychology of
logisticians and route planners, and also of insurgents via what is known as
"red teaming."

The text of this paper can be seen as a template for addressing network
routing in a dynamic environment. The network architecture of Figure 1
brings out the caveats that problems of this type pose, one of which is
conditionalization, discussed in Section 3.2.1. Real
logistical networks are more elaborate. In actual practice the matrix $\mathbf{D}^*$
could have a very large dimension and thus be unmanageable. However,
given the role that $\mathbf{D}^*$ plays, one may simply sample from a high-dimensional
$\mathbf{D}^*$ to work with a more manageable matrix. Besides a prior for $p$,
$\pi(p)$, all that is required of $\mathcal{D}$ are the utilities mentioned in Section 3.
However, these utilities are proxies for costs, and no form of optimization can be
achieved without cost considerations. Finally, this paper shows how statistical
methodologies can be constructively brought to bear on network routing
problems which generically belong in the domain of computer science,
network analysis, and operations research.

We close this paper by illustrating in Section 5.2 the workings of Sections 3
and 4 by using the data of Table 1 to assess the probability of encountering
an IED on the next crossing on the "new bridge."

5.2. The logistics network revisited. With respect to the network of
Figure 1, the data of Table 1 map to the matrix $\mathbf{D}^*$ of Section 4.1, with
its column 2 corresponding to $Y_i$, $i = 1, \ldots, 12$, column 3 corresponding to
$Z_{1,1}, \ldots, Z_{1,12}$, and so on, with column 6 corresponding to $Z_{4,1}, \ldots, Z_{4,12}$.

A logistic regression model

$$P(Y_i = 1;\boldsymbol{\beta},\mathbf{Z}_i) = \frac{1}{1 + \exp\left(-\sum_{u=0}^{4} Z_{ui}\beta_u\right)}$$

for $i = 1, \ldots, 12$, with $Z_{0i} = 1$, was fitted to the data of Table 1 using
independent Gaussian priors with means and standard deviations 10. This
choice of priors is arbitrary. The joint posterior distribution of $(\beta_0, \ldots, \beta_4)$
was obtained via Gibbs sampling with 10,000 simulations after a burn-in of
1,000 simulations.
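
To make the fitting step concrete, the following is a minimal sketch of posterior sampling for the logistic regression above. It is not the procedure used in the paper: a random-walk Metropolis sampler is swapped in for Gibbs sampling to keep the code short, the data are synthetic stand-ins because Table 1 is not reproduced here, and the prior mean of 0, the step size, and all function names are assumptions made for illustration.

```python
import numpy as np

def log_posterior(beta, Z, y, prior_mean=0.0, prior_sd=10.0):
    """Unnormalized log posterior: Bernoulli-logistic likelihood plus
    independent Gaussian priors (prior_mean = 0 is an assumption)."""
    eta = Z @ beta
    # log-likelihood of logistic regression: sum_i [y_i * eta_i - log(1 + exp(eta_i))]
    log_lik = np.sum(y * eta - np.logaddexp(0.0, eta))
    log_prior = -0.5 * np.sum(((beta - prior_mean) / prior_sd) ** 2)
    return log_lik + log_prior

def metropolis_logistic(Z, y, n_keep=10_000, burn_in=1_000, step=0.5, seed=0):
    """Random-walk Metropolis sampler (a stand-in for the paper's Gibbs sampler)."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(Z.shape[1])
    current = log_posterior(beta, Z, y)
    draws = np.empty((n_keep, Z.shape[1]))
    for t in range(burn_in + n_keep):
        proposal = beta + step * rng.standard_normal(beta.size)
        candidate = log_posterior(proposal, Z, y)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < candidate - current:
            beta, current = proposal, candidate
        if t >= burn_in:
            draws[t - burn_in] = beta
    return draws

# Synthetic stand-in for Table 1: 12 observations, 4 covariates, binary outcome.
rng = np.random.default_rng(1)
covariates = rng.uniform(0.0, 5.0, size=(12, 4))
Z = np.column_stack([np.ones(12), covariates])  # first column is Z_0i = 1
y = rng.integers(0, 2, size=12)

draws = metropolis_logistic(Z, y)
posterior_means = draws.mean(axis=0)  # analogue of the "Bayes" row of Table 2
```

With only 12 observations and five coefficients, the prior does real work in keeping the posterior proper, which is one reason the posterior means and the maximum likelihood estimates need not agree exactly.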

The marginal posterior distributions of $\beta_0$, $\beta_2$, and $\beta_4$ were symmetric
looking, but those of $\beta_1$ and $\beta_3$ were skewed to the left; plots of these
distributions are not shown. Table 2 compares the posterior means against the
maximum likelihood estimates, showing good agreement between the two,
save for $\beta_0$.

About 60 samples from the joint posterior distribution of $(\beta_0, \ldots, \beta_4)$ were
generated, and for each sample, the quantity $[1 + \exp(-\sum_{u=0}^{4} \beta_u)]^{-1}$ was
computed. Here $\mathbf{Z} = (1, 1, 1, 1)$, suggesting that the next crossing is to be on the
new bridge, which is one mile away from all four city centers of interest.
Associated with each generated sample is also the probability of the sample;
this is provided by the joint probability density. Figure 5 shows a plot of the
computed quantity mentioned above [our $(p(\mathbf{Z}),\boldsymbol{\beta}^*)$ of Section 4.3] versus
the joint probability. A smoothed plot, smoothed by a moving average of
five consecutive points, is the Monte Carlo induced likelihood.

Table 2
Comparison of Bayes' versus maximum likelihood estimates

Approach             $\beta_0$   $\beta_1$   $\beta_2$   $\beta_3$   $\beta_4$
Bayes                0.635       1.583       3.584       4.382       1.579
Maximum likelihood   1.811       1.817       3.299       4.402       1.311

Fig. 5. Monte Carlo induced likelihood function.
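
The construction of the Monte Carlo induced likelihood can be sketched as follows. This is an illustration under stated assumptions rather than the paper's computation: the posterior is replaced by a hypothetical independent Gaussian (so that both samples and their joint density are available in closed form), the points are ordered along the $p$ axis before smoothing, and the names `induced_likelihood`, `hypothetical_mean`, and `hypothetical_sd` are invented here.

```python
import numpy as np

def induced_likelihood(beta_draws, joint_density, z, window=5):
    """Map each posterior draw to p = 1 / (1 + exp(-z . beta)), pair it with the
    joint density of that draw, and smooth the density with a moving average.
    Assumes an odd window so each average can be centered on a point."""
    p = 1.0 / (1.0 + np.exp(-(beta_draws @ z)))
    order = np.argsort(p)                      # order points along the p axis
    p_sorted, dens_sorted = p[order], joint_density[order]
    kernel = np.ones(window) / window
    smoothed = np.convolve(dens_sorted, kernel, mode="valid")
    half = window // 2
    p_centers = p_sorted[half:len(p_sorted) - half]
    return p_centers, smoothed

# Hypothetical posterior: independent Gaussians (NOT the paper's posterior).
rng = np.random.default_rng(0)
hypothetical_mean = np.array([-0.5, 0.2, -0.3, 0.1, -0.4])
hypothetical_sd = 0.5
beta_draws = hypothetical_mean + hypothetical_sd * rng.standard_normal((60, 5))

# Joint density of each draw under the hypothetical posterior.
standardized = (beta_draws - hypothetical_mean) / hypothetical_sd
joint_density = np.exp(-0.5 * np.sum(standardized ** 2, axis=1)) / (
    hypothetical_sd * np.sqrt(2.0 * np.pi)
) ** 5

# Intercept term plus the covariate vector Z = (1, 1, 1, 1) for the new bridge.
z = np.ones(5)
p_grid, likelihood = induced_likelihood(beta_draws, joint_density, z)
```

A five-point window on 60 draws leaves 56 smoothed values; a wider window would give a smoother but coarser curve.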

Since the new bridge has experienced 4 previous crossings and none
of these crossings has experienced an IED attack, $\mathbf{x} = (0,0,0,0)$; thus,
$\mathcal{L}(p;\mathbf{x})$ is the power of $(1 - p)$ given by equation (4.4). With the above in
place, all the ingredients needed to compute $P(X = 1;\mathbf{x},\mathbf{Z},\mathbf{D}^*)$ of equation (4.7)
are at hand, save for the prior $\pi(p)$. Supposing $\pi(p)$ uniform on $(0, 1)$, we have

$$P(X = 1;\mathbf{x},\mathbf{Z},\mathbf{D}^*) \propto \int_0^1 p\,\mathcal{L}(p;\mathbf{x})\,\mathcal{L}(p;\mathbf{Z},\mathbf{D}^*)\,dp,$$

with $\mathcal{L}(p;\mathbf{Z},\mathbf{D}^*)$ given by the likes of Figure 5. This can be numerically
evaluated over a grid of $p$, say, $p = 0.05, 0.1, \ldots, 0.95, 1$, to obtain
$P(X = 1;\mathbf{x},\mathbf{Z},\mathbf{D}^*) \propto 0.129$. Similarly, we obtain $P(X = 0;\mathbf{x},\mathbf{Z},\mathbf{D}^*) \propto 0.293$.
The normalizing constant is 0.422, giving $P(X = 1;\mathbf{x},\mathbf{Z},\mathbf{D}^*) = 0.306$ and
$P(X = 0;\mathbf{x},\mathbf{Z},\mathbf{D}^*) = 0.694$. Thus, the probability of encountering an IED on the
next crossing on the "new bridge" is 0.306.
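
The numerical integration and normalization can be sketched as below. The function names and the grid are assumptions made for illustration, and the unnormalized weight for $X = 0$ is taken to be the same integrand with $p$ replaced by $1 - p$, which is an inference from the text. Because the paper's likelihoods are not reproduced here, the sketch is checked on flat (uninformative) inputs and then used to normalize the two values reported in the text.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule on a (possibly uneven) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def predictive_probability(p_grid, lik_x, lik_cov, prior):
    """P(X = 1) as the normalized integral of p times the weighted prior.
    lik_x, lik_cov, and prior are arrays evaluated on p_grid."""
    weight = lik_x * lik_cov * prior
    unnorm_1 = trapezoid(p_grid * weight, p_grid)
    unnorm_0 = trapezoid((1.0 - p_grid) * weight, p_grid)
    return unnorm_1 / (unnorm_1 + unnorm_0)

# Sanity check: with flat likelihoods and a uniform prior, P(X = 1) is 1/2.
p_grid = np.linspace(0.0, 1.0, 201)
flat = np.ones_like(p_grid)
check = predictive_probability(p_grid, flat, flat, flat)

# Normalizing the two unnormalized values reported in the text.
unnorm_ied, unnorm_no_ied = 0.129, 0.293
prob_ied = unnorm_ied / (unnorm_ied + unnorm_no_ied)
print(round(check, 3), round(prob_ied, 3))
```

In practice `lik_cov` would be the smoothed curve of Figure 5 interpolated onto `p_grid`, and `lik_x` the likelihood of equation (4.4).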

5.2.1. Optimal route selection for logistical network. In order to prescribe
an optimal route for the network of Figure 1, we need to calculate
the probability of encountering an IED on each of the remaining 9 links
of the network in a manner akin to that given above for link 9, the "new
bridge." This requires that we have the vectors $\mathbf{x}$ and $\mathbf{Z}$ for each of these
links, where $\mathbf{x}$ is the historic IED experience for a link, and $\mathbf{Z}$ is the vector
of covariates associated with the link. This we do not have and are unable
to obtain for reasons of security. Consequently, and purely with the intent
of illustrating how our decision theoretic framework can be put to work, we
shall make some meaningful specifications about the $p(j)$'s, $j = 1, \ldots, 8, 10$.
These will be based on the length of each link relative to the
length of link 9, for which $p(9)$ has been assessed as 0.306; that is, we calibrate
the required $p(j)$'s in terms of $p(9)$.

To do the above, we start by remarking that links 1 and 2 are of almost 
equal length, and are about two-thirds the length of link 9. Links 3 to 8 are 
of equal length and are about one-fifth the length of link 9, whereas link 10 
is about half the length of link 9. Note that Figure 1 is not drawn to scale. 
Thus, we set $p(1) = p(2) = (0.66)(0.306) = 0.20$, $p(10) = (0.50)(0.306) = 0.15$,
and $p(3) = p(4) = p(5) = p(6) = p(7) = p(8) = (0.20)(0.306) = 0.06$. These
choices are purely illustrative; we could have used other methods of scaling
such as the logarithmic or the square root.
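
The linear calibration above amounts to a few lines of arithmetic; the sketch below reproduces it. The dictionary and function names are hypothetical, and rounding to two decimals mirrors the values quoted in the text.

```python
# Probability assessed for link 9 (the "new bridge") in the text.
P_LINK_9 = 0.306

# Length of each link relative to link 9, as stated in the text.
LENGTH_RATIO = {1: 0.66, 2: 0.66, 10: 0.50}
LENGTH_RATIO.update({j: 0.20 for j in range(3, 9)})  # links 3 to 8

def scaled_probability(ratio, p_ref=P_LINK_9):
    """Linear scaling of the reference probability by relative link length."""
    return round(ratio * p_ref, 2)

p = {j: scaled_probability(r) for j, r in sorted(LENGTH_RATIO.items())}
p[9] = P_LINK_9

for link in sorted(p):
    print(link, p[link])
```

Swapping `scaled_probability` for a logarithmic or square-root rule would change only this one function.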

In addition to specifying the $p(j)$'s, we also need to specify utilities. For
this we propose a utility function of the form $1 - n/x$ for a successful route
traversal. Here $n$ is the number of links in the route, and $x$ is a constant
which ensures that a successful traversal does not result in a negative utility.
Specifically, the idea here is that a successful traversal yields a utility of one,
but each link in the route contributes a disutility to which is assigned
a weight $1/x$. Choice $D_1$ entails the route (1, 2, 9), and with $x$ chosen to
be 100, the utility of a successful traversal on this route will be $1 - 3/100$.
Similarly, the failure to achieve a successful traversal yields a negative
utility of $-n/x$, which in the case of route (1, 2, 9)
with $x = 100$ is $-3/100$.
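
The two utilities combine into a simple expected utility. Writing $q$ for the probability of a successful traversal of a route (a symbol introduced here only for illustration), a short derivation gives:

```latex
\begin{aligned}
E[U] &= q\left(1 - \frac{n}{x}\right) + (1 - q)\left(-\frac{n}{x}\right) \\
     &= q - \frac{n}{x}.
\end{aligned}
```

Under this utility the constant $x$ therefore controls only how heavily the number of links is penalized relative to the probability of success.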

The above choices for utility do not take into consideration things such as
composition of the convoys, traversal time, vicinity to hostile territory, costs
of disruption, etc. With the above in place, and assuming independence of
the IED placement events, it can be easily seen that the expected utilities
of choices $D_1$, $D_2$, and $D_3$ are 0.414, 0.361, and 0.430, respectively. Thus,
for the given choices of probabilities and utilities, $\mathcal{D}$'s optimal route will be
$D_3$, which is (1, 2, 3, 4, 10). Observe that neither the shortest nor the longest
route is optimal. Sensitivity of $\mathcal{D}$'s final choice to values of $x$ other than
100 can be explored. For example, were $x$ taken to be 10, then $D_1$ would turn
out to be $\mathcal{D}$'s optimal choice. This is because the probabilities
of a successful traversal via choices $D_1$, $D_2$, and $D_3$ are rather
close to each other, namely, 0.444, 0.441, and 0.480, respectively, so that the
larger per-link penalty $n/x$ favors the shortest route.
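
The expected utilities and the sensitivity to $x$ can be checked with the sketch below. The links of $D_1$ and $D_3$ are those given in the text; the eight-link route used for $D_2$ is an assumption, chosen because it reproduces the reported success probability and expected utility. All names are hypothetical.

```python
from math import prod

# Link-level IED probabilities from the calibration in the text.
P_IED = {1: 0.20, 2: 0.20, 9: 0.306, 10: 0.15, **{j: 0.06 for j in range(3, 9)}}

ROUTES = {
    "D1": (1, 2, 9),
    "D2": (1, 2, 3, 4, 5, 6, 7, 8),  # ASSUMED route; not restated in this section
    "D3": (1, 2, 3, 4, 10),
}

def success_probability(route):
    """Probability of no IED on any link, assuming independent placements."""
    return prod(1.0 - P_IED[j] for j in route)

def expected_utility(route, x):
    """Utility 1 - n/x on success and -n/x on failure, weighted by probability."""
    q, n = success_probability(route), len(route)
    return q * (1.0 - n / x) + (1.0 - q) * (-n / x)

def best_choice(x):
    """Name of the route with the largest expected utility for a given x."""
    return max(ROUTES, key=lambda name: expected_utility(ROUTES[name], x))

for name, route in ROUTES.items():
    print(name, round(success_probability(route), 3),
          round(expected_utility(route, 100), 3))
print(best_choice(100), best_choice(10))
```

The computed values agree with those in the text to within rounding in the third decimal place.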

This completes our discussion on illustrating the workings of the proposed
approach vis-a-vis the network of Figure 1, and closes the paper.